Unsupervised Chinese Personal Name Recognition Using Search Session

Authors

  • Bin WEN
  • Shibin XIAO
  • Yi LUO
  • Xueqiang LV
Abstract

Personal name recognition is an important part of named entity recognition in Web search query logs. This paper proposes an unsupervised method for recognizing Chinese personal names in queries using search sessions. Seed personal names are first produced automatically using Chinese surnames; a local expansion method then finds further candidates through the search sessions in the query logs; and a candidate filtering method, which models the process by which people establish connections with others, filters the candidates using their contexts. On Sogou Web search query logs, P@500 of personal name recognition reached 99%, and the precision of the candidate filtering method was 81.82%. The results support the effectiveness of the proposed recognition method and indicate that the filtering method improves recognition accuracy.
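The three stages named in the abstract (surname-based seeds, session-based expansion, context-based filtering) can be illustrated with a minimal Python sketch. The abstract gives no algorithmic detail, so everything below is an assumption made for illustration: the tiny surname set, the 2 to 3 character length rule, the frequency threshold, and especially the shared-context test, which is only a simple stand-in for the paper's connection-based filtering model. The toy sessions are invented and are not taken from the Sogou logs.

```python
from collections import Counter

# Hypothetical toy surname list; the paper's actual surname lexicon is not given.
SURNAMES = {"王", "李", "张"}


def looks_like_name(query):
    """Assumed shape rule: 2-3 characters, starting with a known surname."""
    return 2 <= len(query) <= 3 and query[0] in SURNAMES


def seed_names(queries, min_count=2):
    """Stage 1 (assumed): frequent name-shaped queries become seeds."""
    counts = Counter(q for q in queries if looks_like_name(q))
    return {q for q, c in counts.items() if c >= min_count}


def expand_in_sessions(sessions, seeds):
    """Stage 2 (assumed): name-shaped queries sharing a session with a seed."""
    candidates = set()
    for session in sessions:
        if any(q in seeds for q in session):
            candidates.update(
                q for q in session if q not in seeds and looks_like_name(q)
            )
    return candidates


def contexts(name, queries):
    """Words that accompany `name` in space-separated queries."""
    ctx = set()
    for q in queries:
        parts = q.split()
        if name in parts:
            ctx.update(p for p in parts if p != name)
    return ctx


def filter_by_context(candidates, seeds, queries):
    """Stage 3 (stand-in): keep candidates sharing a context word with a seed."""
    seed_ctx = set()
    for s in seeds:
        seed_ctx |= contexts(s, queries)
    return {c for c in candidates if contexts(c, queries) & seed_ctx}


# Invented toy sessions: each inner list is one user's consecutive queries.
sessions = [
    ["王菲", "王菲 演唱会", "李亚鹏"],
    ["王菲", "张家界"],
    ["李亚鹏 演唱会"],
    ["张家界 旅游"],
]
queries = [q for s in sessions for q in s]

seeds = seed_names(queries)
candidates = expand_in_sessions(sessions, seeds)
kept = filter_by_context(candidates, seeds, queries)
print(seeds, candidates, kept)
```

In this toy data the place name 张家界 starts with a surname character and co-occurs with a seed, so it survives expansion, but it shares no context word with the seed and is dropped by the filter. That is the kind of false candidate a context-based filter is meant to catch, though the paper's actual criterion may differ.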


Similar resources

Combine Person Name and Person Identity Recognition and Document Clustering for Chinese Person Name Disambiguation

This paper presents the HITSZ_CITYU system in the CIPS-SIGHAN bakeoff 2010 Task 3, Chinese person name disambiguation. This system incorporates person name string recognition, person identity string recognition and an agglomerative hierarchical clustering for grouping the documents to each identical person. Firstly, for the given name index string, three segmentors are applied to segment the se...


Disambiguating Personal Names on the Web Using Automatically Extracted Key Phrases

When you search for information regarding a particular person on the web, a search engine returns many pages. Some of these pages may be for people with the same name. How can we disambiguate these different people with the same name? This paper presents an unsupervised algorithm which produces unique phrases to disambiguate different people with the same name (i.e. namesakes). Our algorithm ta...


Extracting Key Phrases to Disambiguate Personal Names on the Web

When you search for information regarding a particular person on the web, a search engine returns many pages. Some of these pages may be for people with the same name. How can we disambiguate these different people with the same name? This paper presents an unsupervised algorithm which produces key phrases for the different people with the same name. These key phrases could be used to further n...


Extracting Key Phrases To Disambiguate Personal Name Queries In Web Search

Assume that you are looking for information about a particular person. A search engine returns many pages for that person’s name. Some of these pages may be about other people with the same name. One way to reduce the ambiguity in the query and filter out the irrelevant pages is to add a phrase that uniquely distinguishes the person we are interested in from his/her namesakes. We propose an un...


CWePS: Chinese Web People Search

Name ambiguity is a big problem in personal information retrieval, especially given the explosive growth of Web data. In this demonstration, we present a prototype Chinese Web People Search system, called CWePS. Given a personal name as query, CWePS collects the top results from the existing search engines, and groups these returned pages into several clusters. Ideally, the Webpages in the same...



Publication date: 2013